Combining Hidden Markov Models and Stochastic-Context Free Grammars
Abstract
Currently, gene finding and RNA secondary structure prediction are performed separately. This is only valid if, for any DNA sequence, the location of genes and the corresponding RNA secondary structure can be assumed to be independent. I develop methods that take a hidden Markov model for gene finding and a stochastic context-free grammar for RNA secondary structure prediction and, from these, construct a combined model that can make joint predictions. I show that, in general, the combined model is a stochastic context-free grammar, and I compare how an example performs relative to the separate models when data are simulated from the combined model. An investigation of the combined model reveals that apparent dependence can be an artefact of the model rather than dependence present in the data. My results are inconclusive, and further work is required to determine whether future implementation of the joint model is appropriate and feasible on DNA sequence data.
Similar resources
Stochastic Tree-Adjoining Grammars
The notion of stochastic lexicalized tree-adjoining grammar (SLTAG) is defined and basic algorithms for SLTAG are designed. The parameters of a SLTAG correspond to the probability of combining two structures each one associated with a word. The characteristics of SLTAG are unique and novel since it is lexically sensitive (as N-gram models or Hidden Markov Models) and yet hierarc...
Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models
This paper describes a formal model for the recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models. Hidden Markov models are used to recognize mathematical symbols, and a stochastic context-free grammar is used to model the relation between these symbols. This formal model makes it possible to use classic algorithms for parsin...
Recent Advances of Grammatical Inference
In this paper, we provide a survey of recent advances in the field of "Grammatical Inference", with a particular emphasis on the results concerning the learnability of target classes represented by deterministic finite automata, context-free grammars, hidden Markov models, stochastic context-free grammars, simple recurrent neural networks, and case-based representations.
Stochastic Lexicalized Tree-adjoining Grammars
The notion of stochastic lexicalized tree-adjoining grammar (SLTAG) is formally defined. The parameters of a SLTAG correspond to the probability of combining two structures each one associated with a word. The characteristics of SLTAG are unique and novel since it is lexically sensitive (as N-gram models or Hidden Markov Models) and yet hierarchical (as stochastic context-free grammars). Then...